Engineering posts about Neural Networks
Curated summaries and key learnings for engineers working with Neural Networks.
Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use
This article discusses the redesign of a user-sequence platform aimed at improving the efficiency, speed, and usability of user data for machine learning applications. It addresses the challenges...
How Salesforce Built an AI Security Agent for Autonomous Threat Triage
The article outlines how Salesforce developed the SATA agent, an AI-driven system designed to enhance cybersecurity by autonomously triaging threats across complex environments. It highlights the...
Creating a Multi-Tenant AI Agent Platform Handling 7K+ Sessions Without Cross-Team Interference
The article outlines the development of the Bring Your Own Planner (BYOP), a multi-tenant AI agent platform designed to enhance team autonomy and scalability within Salesforce. It addresses the...
Reel Friends: Building Social Discovery that Scales to Billions
This Meta Tech Podcast episode, featuring Pascal Hartig, explores the engineering behind the 'Friend Bubbles' feature of Facebook Reels. The discussion highlights the evolution of...
Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models
The article presents a novel approach to enhancing ad relevance by integrating real-time context into sequential recommender models. It highlights the limitations of previous models that relied...
Pushing the Frontier for Data Agents with Genie
The article presents Genie, a sophisticated data agent developed by Databricks, designed to enhance the analysis of both structured and unstructured enterprise data. It highlights the challenges...
Text-Conditional JEPA for Learning Semantically Rich Visual Representations
The article introduces Text-Conditional JEPA (TC-JEPA), a new framework for learning semantically rich visual representations by leveraging image captions to modulate predicted features. This...
What Matters in Practical Learned Image Compression
The article presents a comprehensive study on learned image compression codecs, emphasizing their optimization for the human visual system. It highlights the development of a new codec that...
Normalizing Flows with Iterative Denoising
The article presents advancements in Normalizing Flows (NFs) through the introduction of iterative TARFlow (iTARFlow), a generative model that combines autoregressive generation with iterative...
SpecMD: A Comprehensive Study on Speculative Expert Prefetching
The article presents SpecMD, a standardized framework designed for benchmarking caching strategies in Mixture-of-Experts (MoE) models. It highlights the importance of an expert caching mechanism to...
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing
The article discusses a novel approach to key-value (KV) caching in transformer language models, focusing on reducing memory footprint while maintaining high throughput during autoregressive...
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
The article introduces PORTool, an importance-aware policy optimization algorithm designed for multi-tool-integrated reasoning in agents powered by large language models (LLMs). It addresses the...
How AI-Driven Kubernetes Optimization Reclaimed Millions from 47% Idle Capacity
The article discusses Salesforce's challenges with infrastructure scaling on its Hyperforce platform, particularly regarding over-provisioning and idle capacity in Kubernetes services. It introduces...
Bootstrapping Sign Language Annotations with Sign Language Models
The article presents a novel approach to enhance sign language annotation through machine learning techniques. It outlines the limitations of current datasets and introduces a pseudo-annotation...
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows
The article introduces STARFlow-V, a novel video generative model that leverages normalizing flows for end-to-end likelihood-based generation. Unlike conventional diffusion-based models, STARFlow-V...
Databricks and Stripe Projects: Infrastructure Built for Agents
The article outlines the collaboration between Databricks and Stripe Projects to enhance AI development through agentic provisioning of Neon databases. It highlights the challenges of manual...
Built In, Not Bolted On: What AI-Native Actually Means in Cybersecurity
The article explores the paradigm of AI-native applications in cybersecurity, emphasizing the importance of integrating AI capabilities directly into the core architecture of security solutions...
StereoFoley: Object-Aware Stereo Audio Generation from Video
StereoFoley is a novel framework developed for generating stereo audio from video content, achieving high fidelity in semantic alignment and temporal synchronization. The framework addresses the...
Local Mechanisms of Compositional Generalization in Conditional Diffusion
The article explores the local mechanisms of compositional generalization in conditional diffusion models, emphasizing their ability to generate samples for out-of-distribution combinations of...
From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest
The article discusses Pinterest's development of a shopping conversion candidate generation model aimed at optimizing offsite conversion events, which are typically sparse and noisy. It details the...